PNAS Nexus
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
The emergence of variants has shaped the COVID-19 pandemic. The lack of directly observed precursors to these variants has led to proposals that variants emerge from persistent infections, transmission in non-human animal populations after reverse-zoonosis, or cryptic transmission in the human population. We investigated the origin of variants by analyzing the molecular clock and rate of nonsynonymous and synonymous substitutions in SARS-CoV-2 circulating in the human population, persistently...
Reunion Island experienced a massive chikungunya virus outbreak in 2024-2025, with more than 54,000 confirmed cases. This is the second major chikungunya outbreak on the island, following the first one that peaked 20 years ago. This new outbreak has been assessed to originate from a single introduction event onto the island, offering a unique opportunity to exploit viral genomic data to understand the epidemiological and dispersal dynamics of the introduced transmission chain. We ...
Disease-specific Polygenic Risk Scores (PRS) are usually evaluated against the incidence of the diseases they were derived for. Individuals may be more interested in how these PRS influence their probable cause of death. Using UK Biobank data, we examined the top 10 causes of death among individuals in the highest quintile of disease-specific PRS for Alzheimer's disease, bowel cancer, cardiovascular disease, coronary artery disease, ischaemic stroke, breast cancer, epithelial ovarian cancer, and pr...
Biological fitness quantifies the efficiency and selective advantage of pathogens and hosts in their bilateral interaction. Key questions--such as how much more infectious an emerging variant is compared with its predecessor, or how much protection vaccination offers relative to no vaccination--require fitness to be measured systematically, in real time, and ideally beyond controlled laboratory settings. We propose an approach that infers biological fitness from mostly non-biological data on inf...
Introduction: Low-dose computed tomography (LDCT) lung cancer screening has significantly enhanced early detection and patient survival rates in the population at risk. Current screening methods, which rely primarily on LDCT imaging, will very likely benefit from molecular biomarkers to achieve a more comprehensive, accurate, personalized and non-invasive risk assessment leveraging multimodal tools. We present a novel open access multimodal (imaging, proteomics and demographic) dataset designed to ...
The recent SARS-CoV-2 pandemic has highlighted the growing importance of infectious disease analysis. An accurate and robust model can empower public health leaders to make timely decisions on social distancing and vaccination policies, thereby reducing the number of cases, hospitalizations and deaths. However, the emergence of new variants and subvariants can significantly alter the transmissibility, immune escape capacity and virulence of the pathogen in a short time, making the number of case...
Immunotherapy with immune checkpoint inhibitors and immunotherapy combined with chemotherapy have represented promising treatments for NSCLC patients, leading to prolonged survival. However, the majority of patients with advanced NSCLC have a poor prognosis. The identification and development of biomarkers for stratifying responders and non-responders to immune checkpoint inhibitors contribute to unraveling the mechanism of the immune checkpoint pathway and the immune-tumor interaction underlying the re...
Purpose: Patient-reported outcomes (PROs) provide a quantitative measure of a patient's quality of life, directly from the patient without external influence or interpretation. Prior studies have demonstrated correlations between individual PROs and cancer treatment response. However, this area of research is still highly understudied, and patient data often goes ignored. Our previous work has shown how changes in insomnia can be used to make binary decisions about a patient's future volume response...
SARS-CoV-2 mutations play a key role in viral evolution, in immune escape, and potentially in disease severity. However, the clinical impact of most mutations remains poorly understood, particularly across different variants. A historical observational study was conducted using SARS-CoV-2 whole-genome sequencing data linked to clinical metadata from 175,503 COVID-19 cases in Israel. The dataset was stratified into four variant-specific periods: B.1.1.7, B.1.617.2, BA.1, and BA.2. Logistic regres...
Despite a decade of immunotherapy, treatment selection in non-small cell lung cancer (NSCLC) still relies on subgroup analyses and clinical scores. I3LUNG (NCT05537922) is currently the largest international, real-world, multimodal, artificial intelligence (AI)-based trial, enrolling 2365 patients. We integrated real-world clinical data (RWD), computed tomography (CT) images, digital pathology (DP), and genomics (G) into machine learning early-fusion (MLEF) and deep-learning intermediate-fusion ...
Introduction: The precise determination of diagnostic cut-off points is essential for the development of multimarker panels in oncology. In previous work on pulmonary nodules, it was observed that the standard two-parameter logistic fit could be insufficient for biomarkers with asymmetric distributions. Furthermore, the calculation of empirical cut-off points based on graphical visualization presented limitations in precision and reproducibility. Objective: This study presents a methodological adva...
Introduction: Despite advancements in non-small cell lung cancer (NSCLC) management through the use of molecular biomarkers, the recently introduced 9th edition of the TNM staging system remains based exclusively on anatomic descriptors, with no consistently demonstrated improvement in risk stratification for early-stage disease. This study explores the integration of a molecular prognostic classifier into the conventional TNM staging system. Methods: We analyzed 502 patients with stage I-III lung ...
Robust early warning of emerging viruses requires sampling populations that drive spread coupled with assays capable of detecting new viral variants or species. Untargeted viral metagenomic sequencing can, in principle, detect any virus, including completely novel ones. Composite airplane wastewater enables monitoring long-distance travelers from central collection points; however, the performance of untargeted viral metagenomic sequencing in this sample type remains unknown. In municipal wastew...
The project aimed to develop a data-driven approach for predicting platelet recovery in cancer treatment-induced thrombocytopenia (CTIT) patients receiving recombinant human thrombopoietin (Rh-TPO). By integrating key clinical indicators into a predictive modeling framework, the study sought to enhance understanding of individual treatment responses and facilitate timely clinical decision-making. A retrospective two-stage modeling analysis was conducted on 400 hospitalized CTIT patients who rece...
Wastewater-based surveillance (WBS) is widely used to monitor respiratory viruses, yet uncertainties remain regarding how viral RNA concentrations in wastewater reflect infection dynamics. Specifically, diurnal variation in shedding and RNA losses during in-sewer transport can impact measured signals. We conducted a field study in a 5-km trunk sewer (travel time of one hour). Wastewater was sampled at the sewer inlet and outlet using autosamplers collecting time-proportional one-hour composite s...
Lung cancer is characterized by profound intratumoral and inter-patient heterogeneity, spanning histological subtypes, molecular landscapes, and the tumor microenvironment. While multi-omics integration is essential for capturing this complexity, leveraging these data to explicitly define survival-associated subpopulations remains a significant challenge. In this study, we developed NeuroMDAVIS-FS, an unsupervised deep learning framework designed to stratify lung cancer p...
Wastewater surveillance has been widely adopted since the COVID-19 pandemic, but non-sewered or onsite sanitation is a common form of sanitation in cities of low- and middle-income countries. Environmental surveillance in these settings requires expanding analyses beyond wastewater. We collected 81 soil samples adjacent to public waste bins inside the sewered and non-sewered areas of Maputo and a 150-meter-wide buffer zone between the two areas, as well as from subsistence farms near the wastewa...
Background: Rapid emergence and replacement of SARS-CoV-2 variants underscore the need for early and reliable indicators of variant dominance to guide timely public health response. However, early genomic trajectories are typically short, sparse, and noisy, with strong fluctuations and substantial cross-country heterogeneity in sequencing intensity and reporting. Methods: We develop a scalable forecasting framework that predicts whether new variants will reach high prevalence and how long they will...
Introduction: Precision oncology has informed cancer care by enabling the discovery and application of diagnostic, prognostic, and/or predictive molecular biomarkers. However, many patients lack actionable biomarkers or fail to respond to biomarker-directed therapies. Patient similarity approaches can leverage comprehensive tumor profiling and prior clinical experiences from large cohorts for decision support, facilitating broader realization of precision oncology insights. Methods: We developed a ...
Applying deep learning models to RNA-Seq data poses substantial challenges, primarily due to the high dimensionality of the data and the limited sample sizes. To address these issues, this study introduces an advanced deep learning pipeline that integrates feature engineering with data augmentation. The engineering application focuses on biomedical engineering, specifically the classification of RNA-Seq datasets for disease diagnosis. The proposed framework was initially validated on synthetic d...